Abstract

Decision making in military medical evacuation (MEDEVAC) of casualties consists of identifying
which MEDEVAC asset to dispatch in response to a casualty and to which medical treatment
facility to transport the casualty, both of which contribute to the likelihood of casualty survival.
These decisions become complicated when MEDEVAC assets and medical treatment facilities
are distinguishable and casualties are prioritized as life-threatening and non-life-threatening. In
this paper, an undiscounted, infinite-horizon Markov decision process (MDP) model is developed that
examines the interrelated decisions of how to optimally dispatch MEDEVAC assets to calls for
service and transport casualties to medical treatment facilities. The model accounts for errors
made during triage of casualties to investigate the revelation of information over time and allows
for batch arrival of casualties to the system. The MDP is solved with a value iteration
algorithm. The optimal policy is compared to three heuristic casualty transport policies.

Keywords:  Markov  decision  process,  military  medical  evacuation  systems,  triage 


1  Introduction 


Effective  medical  evacuation  (MEDEVAC)  of  wounded  soldiers  (casualties)  in  military  operations 
is  important  to  the  survivability  of  the  combat  soldier  (Zinder,  2007).  Transporting  casualties  to  a 
medical treatment facility in a timely manner prevents deterioration in health and the potential death
of casualties. The effective MEDEVAC of casualties also contributes to a potential psychological
advantage for those participating in combat, who understand that medical assistance will come
quickly once requested (Bastian et al., 2012).

While  this  paper  focuses  on  a  model  configuration  for  the  United  States,  the  model  is  applicable 
to  other  countries.  Effective  evacuation  of  casualties  is  an  important  problem  shared  across  all 
countries  that  support  combat  troops.  Moreover,  issues  examined  in  this  paper,  such  as  imperfect 
initial  triage,  the  collection  of  new  information  over  time,  and  medical  guidelines  for  transporting 
military  casualties,  are  generally  shared  across  countries,  as  evidenced  by  a  recent  MEDEVAC 
summit  held  in  London,  England  in  October  2013  (MEDEVAC  Summit,  2013). 

This paper focuses on the dispatch and transport of casualties in United States military systems.
Casualties arrive as calls for service, where dispatchers interpret the call detailing a casualty
event, make a resource allocation decision regarding which MEDEVAC asset to dispatch to the
casualty, and later select the medical treatment facility to which the casualty is transported (Bozell, 2013).
The MEDEVAC asset dispatched also transports the casualty to the medical treatment facility
(i.e., a different MEDEVAC asset would not transport the casualty based on additional information
collected at the scene), and therefore the dispatch and transport decisions are interrelated.

Identifying  effective  policies  for  dispatching  air  MEDEVAC  assets  and  transporting  casualties 
can be counter-intuitive. The air MEDEVAC assets in a fleet are distinguishable by their base
location and therefore have different response times to casualties. Likewise, medical treatment
facilities are distinguishable by both the level of care they can provide (i.e., a role 2 medical treatment
facility versus a role 3 medical treatment facility) and the proximity of the medical treatment facility to
the casualty location. Further, there exists a triage scheme within military evacuation systems, in
which the categories used to rank injuries for precedence in evacuation are as follows (Bozell, 2013):

• “CAT A”: Alpha category includes urgent casualties that need to be treated within one hour.

• “CAT B”: Bravo category includes priority casualties that need to be treated within four
hours.

• “CAT C”: Charlie category includes routine casualties that need to be treated within
twenty-four hours.

The evacuation triage system lends itself to sub-categorizing casualties based upon priority. For
example, in a system that identifies only high-priority and low-priority casualties, the calls for service
categorized as CAT A could be treated as high-priority, while all other calls for
service could be treated as low-priority. An alternative classification would treat both the CAT
A and CAT B calls for service as high-priority and the CAT C calls for service as low-priority. The
prioritization scheme can be generalized to systems with more than two priority levels.

Military  medical  evacuation  systems  aim  to  transport  CAT  A  casualties  to  a  medical  treatment 
facility  within  one  hour,  a  practice  commonly  known  as  the  Golden  Hour  (Bozell,  2013).  The 
fundamental  idea  of  the  Golden  Hour  is  that  mortality  is  least  likely  to  occur  if  initial  treatment 
of a severe trauma casualty begins within one hour post injury. A military evacuation system is
evaluated against the Golden Hour standard to increase survivability of the most urgent CAT A
casualties. Improving the logistics of MEDEVAC systems to meet the Golden Hour standard is an
important problem that appears frequently in the popular press (Pahon, 2012; Doane, 2011;
Shinkman, 2013).

The  Golden  Hour  performance  measure  evaluates  time  until  treatment  of  a  casualty,  and  it  is 
in  contrast  with  performance  measures  used  by  civilian  emergency  medical  systems  (EMS).  Nearly 
all  civilian  EMS  systems  evaluate  performance  according  to  response  times  as  opposed  to  casualty 
delivery  times  (McLay,  2010).  As  a  result,  nearly  all  research  in  civilian  systems  focuses  on  triage 
accuracy and initial dispatch decisions. The importance of triage for resource allocation
decisions is well documented in civilian systems (see Clawson et al., 1999; Dunford, 2002), and
this issue becomes more complex in military MEDEVAC systems because dispatch and transport
decisions are interrelated and more accurate information is collected at the scene.

This  paper  formulates  a  Markov  decision  process  (MDP)  model  to  solve  a  MEDEVAC  asset 
dispatching and casualty transporting problem with two interrelated types of decisions: first, how
to initially dispatch location-dependent air MEDEVAC assets to location-dependent casualties,
and second, how to choose among distinguishable hospitals when transporting the casualties. Both types of
decisions indirectly affect the high-priority casualty’s likelihood of survival, which depends
on the time until treatment in a medical treatment facility (Cunningham et al., 1997). To gain
insight into military medical evacuation systems, the MDP model determines how to maximize the
long-run average Golden Hour reward over the truly high-priority casualties while also providing
timely evacuation to low-priority casualties. The MDP model allows for classification errors in the
initial triage, in which a truly low-priority casualty may be initially classified as high-priority, and
vice versa, thus leading to dispatch decisions made with imperfect information. However, upon arrival
at the scene, it is assumed that the medics from the responding air MEDEVAC asset accurately
diagnose the severity of each call and thus make transport decisions to the medical treatment facility
with perfect information. The MDP model also accounts for batch, or multiple, casualties in a call
for service.

This paper is organized as follows. Section 2 provides a literature review on military MEDEVAC
asset optimization as well as dispatching and transporting models in the operations research
literature. Section 3 outlines the novel MDP model. A computational example of the U.S. configuration
is included in Section 4. Concluding remarks and future work are given in Section 5.

2  Background 

There are a number of existing military research papers related to this effort. Higgins (2010)
introduces the role and capabilities of U.S. Army MEDEVAC helicopters by providing an assessment of
the operational issues. Operational issues of helicopters are pivotal in any research study of
casualties and medical evacuation systems, because the speed of response dictates survival likelihood.
Several models focus on locating assets. Bastian (2010) presents a multi-criteria decision analysis
model to determine the minimum number of MEDEVAC helicopters needed at each medical
treatment facility to maximize the coverage of the theater-wide casualty demand, while minimizing
the maximal medical treatment facility evacuation site total vulnerability to enemy attack. Zeto
et al. (2006) also seek to maximize the theater-wide casualty demand coverage by examining the
pre-location of air MEDEVAC assets, along with type and quantity, while balancing MEDEVAC
asset reliability. Fulton et al. (2009) introduce a two-stage stochastic optimization modeling
framework for the medical evacuation of casualties, which identifies optimal casualty evacuation sites
and medical treatment facility sites in response to stochastic demands for service. In contrast to
asset emplacement strategy, this paper considers dispatching and transporting decision-making to
maximize a Golden Hour utility function.

Bastian et al. (2012) examine the required capabilities of future U.S. MEDEVAC platforms,
including identifying the zero-risk aircraft ground speed. Bastian
et  al.  (2013)  further  examine  three  research  issues  surrounding  future  U.S.  MEDEVAC  platforms, 
including  optimal  operational  capabilities,  trade-off  considerations  of  different  aircraft  engines,  and 
the  effect  of  weaponizing  the  current  MEDEVAC  asset  fleet  on  range,  coverage  radius,  and  response 
time.  While  Bastian  et  al.  (2012,  2013)  evaluate  competing  objectives  of  future  casualty  evacuation 
systems,  this  paper  focuses  on  tactical  issues  such  as  real-time  dispatch  and  transportation  issues, 
two  issues  that  have  been  overlooked  in  the  military  MEDEVAC  optimization  literature.  Therefore, 
the  model  in  this  paper  provides  a  unique  contribution  to  the  military  MEDEVAC  optimization 
literature  by  examining  the  inter-related  dispatching  and  casualty  transporting  decisions  given  that 
there  are  errors  in  initial  triage. 

Military  and  civilian  emergency  service  systems  are  similar  in  nature  as  both  systems  deal 
with  the  transportation  of  time-sensitive  customers/patients  to  higher  level  medical  care  facilities. 
Further,  both  emergency  service  systems  have  high  and  low  prioritized  customers,  a  complexity 
that  makes  resource  allocation  decisions  difficult.  To  improve  response  and  transport  times  to 
the  truly  most  critical  patients,  it  is  important  to  understand  when  to  dispatch  the  closest  server 
versus when to ration that asset instead. McLay and Mayorga (2013) present an MDP model for
dispatching servers to spatially-distributed patients that maximizes the fraction of patients who are
responded to within a fixed time frame while allowing for the possibility of classification errors in
initial patient classification. This paper is similar to McLay and Mayorga (2013) as both develop an
MDP model and consider potential classification errors in initial classification. However, they differ
in  that  this  paper  evaluates  the  impact  of  additional  information  that  becomes  available  over  time 
during  the  response  to  and  treatment  of  a  casualty  as  well  as  its  impact  on  transport  decisions. 

Several  other  papers  have  examined  dispatch  issues  for  civilian  EMS  and  fire  departments. 
Jarvis  (1975)  introduces  a  Markov  decision  process  for  determining  optimal  dispatching  policies  for 
a  single  type  of  server.  Swersey  (1982)  develops  a  Markov  model  for  determining  how  many  fire 
engines  to  send  to  prioritized  fire  calls  that  balances  the  costs  associated  with  dispatching  too  few  or 
too many. Ignall et al. (1982) extend this model to account for calls and fire engines that are spatially
distributed, and they provide a “preparedness” heuristic rather than exploring an optimal
solution. Both Andersson and Varbrand (2007) and Lee (2011) propose similar “preparedness”
heuristics for dispatching ambulances to calls. A related stream of literature focuses on spatial
queuing models and approximations that describe dispatching dynamics rather than prescribing
dispatch decisions (Larson, 1974, 1975; Budge et al., 2009; Jarvis, 1985).

Emergency medical service systems must identify the hospital to which customers/patients are transported. On
the civilian side, the patient or protocols from the medical director dictate to which hospital an
ambulance takes a patient, and therefore, there are rarely choices. Shunko et al. (2011) explore
hospital  transport  decisions  using  game  theory  in  the  context  of  two  competing  hospitals  that  can 
send  delay  signals  to  turn  away  incoming  ambulances,  a  situation  that  does  not  arise  in  military 
medical  systems. 

In summary, this paper is distinct from the existing civilian emergency service systems literature
because of its consideration of batch arrivals, prioritized casualties, and the inclusion of casualty
transport in the modeling framework.

3  Markov  decision  process  model 

This  section  presents  the  MDP  model  for  dispatching  air  MEDEVAC  assets  and  transporting 
prioritized  casualties  in  a  military  medical  evacuation  system  with  imperfect  information  during 
triage.  The  model  parameters  depend  on  the  elapsed  time  during  the  treatment  of  each  casualty, 
because  military  medical  evacuation  systems  are  evaluated  by  the  transfer  of  high-priority  casualties 
to  the  medical  treatment  facility  before  the  Golden  Hour.  There  are  seven  time  steps  to  a  military 
medical  evacuation  (Bastian,  2010): 

1.  Notification  time  (call  arrival). 

2.  MEDEVAC  asset  departure  time  (“wheels  up”). 

3.  Arrival  at  the  scene. 

4.  Departure  from  the  scene. 

5.  Arrival  at  the  medical  treatment  facility. 

6.  Transfer  of  the  casualty  to  medical  treatment  facility. 

7.  Arrival  at  MEDEVAC  asset  home  location  (return  to  service). 

Figure 1: Time line during military medical evacuation

Figure  1  presents  the  four  time  intervals  used  throughout  the  remainder  of  this  paper.  Response 
time  is  the  length  of  time  from  departure  time  (2)  to  the  MEDEVAC  asset  arrival  at  the  scene 
(3). Service time is the length of time from departure time (2) to leaving the scene (4). Transport
time is the length of time from the MEDEVAC asset leaving the scene (4) to returning to its home
station after transporting a casualty (7). Transfer time is the length of time from injury (1) to the
casualty  being  transferred  to  a  medical  facility  (6). 
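
The four intervals defined above can be computed directly from the seven time stamps. The following is a minimal sketch rather than part of the model; the class name, field names, and the choice of minutes as the unit are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class EvacuationTimes:
    """Time stamps (in minutes) of the seven steps; field names are illustrative."""
    notification: float        # step 1
    wheels_up: float           # step 2
    scene_arrival: float       # step 3
    scene_departure: float     # step 4
    mtf_arrival: float         # step 5
    casualty_transfer: float   # step 6
    return_to_service: float   # step 7


def intervals(t: EvacuationTimes) -> dict:
    """Return the four intervals defined in the text, keyed by name."""
    return {
        "response": t.scene_arrival - t.wheels_up,             # (2) to (3)
        "service": t.scene_departure - t.wheels_up,            # (2) to (4)
        "transport": t.return_to_service - t.scene_departure,  # (4) to (7)
        "transfer": t.casualty_transfer - t.notification,      # (1) to (6)
    }
```

Note that only the transfer time is measured from step (1), which is why it is the interval compared against the Golden Hour.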

The  input  parameters  of  the  MDP  model  are  summarized  next,  followed  by  the  system  dynamics. 

$n$ = the number of casualty locations,

$m$ = the number of air MEDEVAC assets, each at a fixed home location,

$d$ = the number of medical treatment facilities,

$R$ = the classified risk level during triage, with $R \in \{H, L\}$, where $H$ ($L$) denotes classified
high-risk (low-risk),

$r$ = the true risk level, with $r \in \{H', L'\}$, where $H'$ ($L'$) denotes truly high-risk (low-risk),

$\lambda$ = the call arrival rate,

$X$ = random variable representing the number of casualties $X \in \{1, 2, \ldots, N\}$ arriving in a batch
arrival,

$P_i^X$ = the conditional probability that a batch call for service with $X$ casualties arrives at location
$i$, given that a call arrives, $i = 1, 2, \ldots, n$, $X = 1, 2, \ldots, N$,

$P_{R|i}^X$ = the conditional probability that a batch call for service with $X$ casualties that arrives at
location $i$ has classified risk level $R \in \{H, L\}$, given that a call arrives, $i = 1, 2, \ldots, n$,
$X = 1, 2, \ldots, N$,

$P_{r|R,i}^X$ = the conditional probability that a batch call for service with $X$ casualties and classified
risk level $R \in \{H, L\}$ has true risk level $r \in \{H', L'\}$, $i = 1, 2, \ldots, n$,

$\mu_{ij}^X$ = the expected service time when MEDEVAC asset $j$ responds to a batch call for service with
$X$ casualties at location $i$, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$, and $X = 1, 2, \ldots, N$,

$\delta_{ijk}$ = the expected transport time when MEDEVAC asset $j$ transports a batch of casualties from
location $i$ to medical treatment facility $k$, where $j = 1, 2, \ldots, m$, $i = 1, 2, \ldots, n$, and
$k = 1, 2, \ldots, d$,

$u_{ijkr}^X$ = the expected utility when MEDEVAC asset $j$ transports a batch of $X$ casualties with
true risk level $r \in \{H', L'\}$ from location $i$ to medical treatment facility $k$, where
$j = 1, 2, \ldots, m$, $i = 1, 2, \ldots, n$, $X = 1, 2, \ldots, N$, and $k = 1, 2, \ldots, d$.

The state variable reflects the positions of each of the $m$ MEDEVAC assets and thus can be
represented by the $m$-dimensional vector $s(t) = (s_1, s_2, \ldots, s_m)$. To describe the state space
succinctly, we describe all possible values that each component of the state variable can take. In
state $s(t)$, the component $s_j$ for MEDEVAC asset $j$ takes one of three types of values, corresponding
to the three possible events in the system: 1) asset $j$ is sent to a call for service that arrives, and
while it services this call at the scene, $s_j = (i, R, X)$ is described by a call location $i$, a classified
priority $R$, and a batch size $X$; 2) asset $j$ finishes service at the scene and begins transporting
casualties to a medical treatment facility, denoted by $s_j = D_k^i$; and 3) a busy MEDEVAC asset
becomes free and returns to its home location ($s_j = 0$). Note that these possible values for $s_j$,
namely $(i, R, X)$, $D_k^i$, and $0$, are all distinct values that are mapped to integers in the
computational implementation of the value iteration algorithm used to solve the model. The total
number of states is equal to $(1 + 2Nn + nd)^m$. Although the
MDP model suffers from the so-called “curse of dimensionality,” we will explore conditions under
which the state space can be made smaller in Section 4.
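
To make the state count concrete, the sketch below enumerates the values a single component of the state can take and maps them to integers, as the text says the implementation does. It is an illustration under assumptions: the tuple encodings `("serve", i, R, X)` and `("transport", i, k)` are hypothetical stand-ins for the serving and transporting values, and the batch limit `N = 2` used in the usage note is not taken from the paper.

```python
def component_values(n: int, N: int, d: int) -> list:
    """All values one asset's component s_j can take (illustrative encoding)."""
    idle = [("idle",)]
    serving = [("serve", i, R, X)
               for i in range(1, n + 1)
               for R in ("H", "L")
               for X in range(1, N + 1)]
    transporting = [("transport", i, k)
                    for i in range(1, n + 1)
                    for k in range(1, d + 1)]
    return idle + serving + transporting


def num_states(n: int, m: int, N: int, d: int) -> int:
    """Total number of states: (1 + 2Nn + nd)^m."""
    return (1 + 2 * N * n + n * d) ** m


def integer_index(values: list) -> dict:
    """Map each distinct component value to an integer code."""
    return {value: code for code, value in enumerate(values)}
```

With n = m = 4, d = 2, and a hypothetical N = 2, each component takes 25 values, so the state space has 25^4 = 390,625 states, which illustrates how quickly it grows with m.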

Only one component of the state variable changes after an event occurs at time $t$. Therefore,
$s(t+1) = s(t)$ except for component $s_j$ in $s(t+1)$. Let $\phi$ denote the new value of $s_j$ in state
$s(t+1)$, either $(i, R, X)$, $D_k^i$, or $0$. Let the transition function $s(t+1) = S^M(s(t) \mid s_j = \phi)$ capture
the new state at time $t + 1$.


The following assumptions are made in the model. First, if a call for service arrives, a
MEDEVAC asset must be dispatched to the casualty if any are available. Otherwise the call is
assumed lost to our system. This assumption is acceptable because in practice, military systems
leverage other assets to treat these casualties (Bozell, 2013). Second, service cannot be preempted
and air MEDEVAC assets cannot be rationed in expectation of incoming calls for service. Third,
a MEDEVAC asset selects a medical treatment facility destination immediately prior to departing
from the scene, and immediately after reassessing the casualty risk to obtain the true risk level
$r$. Therefore, the transport decision is made with knowledge of the true risk level $r \in
\{H', L'\}$. This can be contrasted with the dispatch decision, which is made with the potentially
inaccurate triage classification. Thus, the interrelated decisions of dispatch and transport capture
the revelation of information about each casualty’s risk level over time. Fourth, the MEDEVAC asset
that responds to the casualty must transport the casualty to the medical treatment facility. Fifth,
batch arrivals of casualties at a location can be transported by a single responding MEDEVAC
asset; that is, the capacity of an air MEDEVAC asset is greater than or equal to the number of
casualties in a batch. This assumption is reasonable, since in practice, the capacity of a MEDEVAC
asset is large enough to transport virtually all batched casualties that arrive (Bastian, 2010).
Lastly, risk levels are assessed on a batch level, not a casualty level. A single asset responds to a
batch, not individual casualties, and therefore, assigning risk levels on the batch level is practical
and easier to operationalize.
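
The first two assumptions and the single-component transition can be sketched in a few lines. This is an illustrative sketch only: it assumes a state is stored as a tuple with `0` marking an idle asset, uses hypothetical tuple encodings for busy assets, and indexes assets from zero.

```python
IDLE = 0  # value of s_j when asset j is at its home location


def dispatch_actions(state: tuple) -> list:
    """Indices of idle assets that may be dispatched.

    An empty list means no asset is available, so the arriving call is lost.
    """
    return [j for j, s_j in enumerate(state) if s_j == IDLE]


def transition(state: tuple, j: int, new_value) -> tuple:
    """Return s(t+1): a copy of s(t) with only component j replaced."""
    return state[:j] + (new_value,) + state[j + 1:]
```

Because busy assets never appear in the dispatch set, service is never preempted; and because the caller must pick from a non-empty set whenever one exists, assets are not held back for future calls.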

The objective of the MDP model is to determine which MEDEVAC asset to dispatch to a
casualty and to which medical treatment facility to transport a casualty, for each state in
the state space, so that the expected number of truly high-priority calls that arrive at a medical
treatment facility within one hour per stage is maximized.

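The optimality equations for the infinite-horizon, average cost MDP model are given next, where
$I_{\{s_j = (i,R,X)\}}$, $I_{\{s_j = D_k^i\}}$, and $I_{\{s_j = 0\}}$ are indicator functions representing that MEDEVAC asset $j$ is serving
$X$ casualties at location $i$, traveling to medical treatment facility $k$, and being idle, respectively. An
infinite-horizon MDP model with steady-state parameters is appropriate because of the duration
of military operations. We use uniformization to convert a continuous time MDP model into an
equivalent discrete time MDP model. To apply uniformization, the maximum rate of transitions is
determined to be $\gamma = \lambda + \sum_{j=1}^{m} \beta_j$, where $\beta_j = \max\left\{ \max_{i,X} (\mu_{ij}^X)^{-1}, \; \max_{i,k} (\delta_{ijk})^{-1} \right\}$, $j = 1, \ldots, m$.

Note that $g$ is the optimal average utility per stage and $v_t(s(t))$ is a relative value function in state
$s(t) = (s_1, s_2, \ldots, s_m)$; and $A_1(s(t))$ and $A_2(s(t))$ represent the set of dispatching and transporting
actions available in state $s(t)$ during iteration $t$, respectively.

$$
\begin{aligned}
g + v_t(s(t)) = \frac{1}{\gamma} \Bigg\{ & \sum_{k=1}^{d} \sum_{j=1}^{m} \sum_{i=1}^{n} (\delta_{ijk})^{-1} \, I_{\{s_j = D_k^i\}} \, v_t\big(S^M(s(t) \mid s_j = 0)\big) && \text{(1a)} \\
& + \lambda \sum_{i=1}^{n} \sum_{X=1}^{N} \sum_{R \in \{H,L\}} P_i^X P_{R|i}^X \max_{j \in A_1(s(t))} \Big\{ v_t\big(S^M(s(t) \mid s_j = (i,R,X))\big) \Big\} && \text{(1b)} \\
& + \sum_{i=1}^{n} \sum_{X=1}^{N} \sum_{j=1}^{m} \sum_{R \in \{H,L\}} (\mu_{ij}^X)^{-1} \, I_{\{s_j = (i,R,X)\}} \sum_{r \in \{H',L'\}} P_{r|R,i}^X \max_{k \in A_2(s(t))} \Big\{ v_t\big(S^M(s(t) \mid s_j = D_k^i)\big) + \gamma u_{ijkr}^X \Big\} && \text{(1c)} \\
& + \Bigg( \gamma - \lambda - \sum_{i=1}^{n} \sum_{X=1}^{N} \sum_{j=1}^{m} \sum_{R \in \{H,L\}} (\mu_{ij}^X)^{-1} \, I_{\{s_j = (i,R,X)\}} - \sum_{i=1}^{n} \sum_{j=1}^{m} \sum_{k=1}^{d} (\delta_{ijk})^{-1} \, I_{\{s_j = D_k^i\}} \Bigg) v_t(s(t)) \Bigg\} && \text{(1d)}
\end{aligned}
$$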

The four lines in the optimality equations (1) represent the three events that result in a change in the
state variable plus the fourth “event” that nothing changes between stages (line (1d)). The first
line (1a) accounts for the event of busy air MEDEVAC assets completing service after transporting
a casualty, where $(\delta_{ijk})^{-1}$ is the rate implied by the mean transport time $\delta_{ijk}$ and $v_t(S^M(s(t) \mid s_j = 0))$ represents the
value of the state when MEDEVAC asset $j$ returns home and is available for service. Line (1b)
accounts for the dispatch of a MEDEVAC asset to a batch of casualties of size $X$ that arrives to
the system with classified priority $R$. Here, $P_i^X P_{R|i}^X$ captures the probability that an arriving call
for service has $X$ casualties at casualty location $i$ and classified risk level $R$. The
second part of line (1b) selects the MEDEVAC asset $j$ to dispatch to the incoming call for service $(i, R, X)$
that maximizes the value of dispatching a MEDEVAC asset. Line (1c) accounts for the decision
to transport a batch of casualties to a medical treatment facility, which occurs when service at the
scene is completed. Here, $(\mu_{ij}^X)^{-1}$ is the rate implied by the mean service time at the scene, and $P_{r|R,i}^X$ captures
the conditional probability of the casualty’s true risk level $r$ given its classified risk level $R$ and
location $i$. The second part of line (1c) selects the medical treatment facility that maximizes the
value of transporting casualties to a medical treatment facility, where $u_{ijkr}^X$ represents the reward
received when MEDEVAC asset $j$ transports $X$ casualties at casualty location $i$ of true risk level $r$
to medical treatment facility $k$. Appendix A summarizes a value iteration convergence algorithm
using the corresponding $N$-stage finite-horizon optimality equations (see Puterman, 1994).
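
The uniformization rate can be illustrated with a short sketch. It assumes the reading that each asset contributes its fastest possible event rate, i.e. the larger of its fastest service rate and its fastest transport rate, and that the arrival rate is added on top. The nested-list layout `mu[j][i][X]` and `delta[j][i][k]` and all numbers in the usage note are hypothetical.

```python
def uniformization_rate(lam: float, mu: list, delta: list):
    """Return (gamma, beta) for the uniformized chain.

    mu[j][i][X]    : expected service time of asset j at location i, batch size X
    delta[j][i][k] : expected transport time of asset j from location i to facility k
    """
    beta = []
    for mu_j, delta_j in zip(mu, delta):
        # A rate is the reciprocal of an expected time.
        fastest_service = max(1.0 / t for row in mu_j for t in row)
        fastest_transport = max(1.0 / t for row in delta_j for t in row)
        beta.append(max(fastest_service, fastest_transport))
    return lam + sum(beta), beta
```

For example, with an arrival rate of 0.5 per hour and two assets whose fastest event rates are 5 and 4 per hour, the sketch returns a uniformization rate of 9.5, so every transition probability in the discrete-time chain is a rate divided by 9.5.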

Lastly, the assessment of the severity of the calls is imperfect during triage, resulting in possible
mismatches between the classified risk level $R \in \{H, L\}$ and the true risk level $r \in \{H', L'\}$, as
captured by the $P_{r|R,i}^X$ parameters. The accuracy of triage classification is assumed known, such
as from past system performance data. Since military medical evacuation systems are evaluated
according to the response to casualties that are truly the most critical ($H'$), it is of interest to
match the classified risk levels $R$ to the true risk levels $r$. Let $\alpha^X$ denote the ratio of the proportion
of classified high-risk casualties that are truly high-risk to the proportion of classified low-risk
casualties that are truly high-risk:

$$\alpha^X = \frac{P_{H'|H}^X}{P_{H'|L}^X}.$$

Therefore, $\alpha^X$ can be interpreted as the accuracy of the triage for high-priority casualties, which we
assume is independent of the casualty location. When $\alpha^X = 1.0$, the classified high-risk casualties
are exactly as likely to be truly high-risk as classified low-risk casualties, so triage conveys no
information; for $\alpha^X > 1.0$ they are more likely. As $\alpha^X \to \infty$, the set
of truly high-risk casualties is a subset of classified high-risk casualties (when $P_{H'} < P_H$). In the
MDP model, input parameter $P_{H'|H,i}^X$ is a function of $\alpha^X$, and can be computed as follows. Let
$P_{H'|i}^X$ denote the probability that a batch of $X$ casualties at location $i$ is truly high-risk. Note
that $\alpha^X = P_{H'|H,i}^X / P_{H'|L,i}^X$ for every location $i$, since triage accuracy is independent of the casualty location. Rearranging and
applying Bayes rule yields:

$$P_{H'|H,i}^X = \alpha^X P_{H'|L,i}^X = \alpha^X \, \frac{P_{L|H',i}^X \, P_{H'|i}^X}{P_{L|i}^X}.$$

Rearranging, noting that $P_{L|H',i}^X = 1 - P_{H|H',i}^X$ and $P_{L|i}^X = 1 - P_{H|i}^X$, and applying Bayes rule again yields:

$$P_{H'|H,i}^X = \frac{\alpha^X P_{H'|i}^X}{\alpha^X P_{H|i}^X + P_{L|i}^X}.$$

The analogous procedure can be applied to classified low-risk calls, yielding

$$P_{H'|L,i}^X = \frac{P_{H'|i}^X}{\alpha^X P_{H|i}^X + P_{L|i}^X}.$$
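
As a numerical check on this derivation, the sketch below computes the two conditional probabilities directly from the definition of the accuracy ratio and the law of total probability. It is an illustration under assumptions: the function and argument names are hypothetical, and the inputs are taken to be the accuracy ratio, the probability that a call at a location is classified high-risk, and the probability that it is truly high-risk.

```python
def true_high_given_classified(alpha: float, p_class_high: float, p_true_high: float):
    """Return (P(truly high | classified high), P(truly high | classified low)).

    Follows from alpha = P(H'|H) / P(H'|L) together with
    P(H') = P(H'|H) P(H) + P(H'|L) P(L).
    """
    p_class_low = 1.0 - p_class_high
    denom = alpha * p_class_high + p_class_low
    p_given_high = alpha * p_true_high / denom
    p_given_low = p_true_high / denom
    if p_given_high > 1.0:
        raise ValueError("inputs are inconsistent: P(H'|H) would exceed 1")
    return p_given_high, p_given_low
```

With a ratio of 4, 20% of calls classified high-risk, and 10% truly high-risk, the sketch gives 0.25 and 0.0625; weighting these by 0.2 and 0.8 recovers the 10% prior, which is the consistency check the derivation must satisfy.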

Appendix B contains theoretical results related to transportation policies in the MDP model
proposed in this paper. The first result indicates that if the expected times to transfer a casualty
at two medical treatment facilities are the same, it is optimal to transport to the medical treatment
facility with the higher utility. The second result indicates that if the utilities for transferring a
casualty at two medical treatment facilities are the same, it is optimal to transport to the medical
treatment facility with the shorter expected time until transfer. However, these results are not
actionable when there are trade-offs between transfer time and quality of care. Therefore, we
examine the trade-offs in the computational example in the following section.

4  Computational  example 

4.1  Problem  setup 

Consider a military medical evacuation system example in support of a U.S. Army brigade, where
the locations of casualties to be evacuated and of medical treatment facilities are both known. As
described in Bastian et al. (2012), the area of operations for future U.S. Army brigades (a military
unit with over 3,000 personnel) is 300 km², and a sub-area of 30 km² containing four air MEDEVAC
assets and four casualty locations (m = n = 4) is proposed for analysis here (see Figure 2).


Figure 2: Geography of military medical evacuation system with 4 casualty locations and 4 air
MEDEVAC assets

Each square in Figure 2 is 15 kilometers long and 15 kilometers wide. Travel times are computed
using the Euclidean distance between locations, and MEDEVAC assets have a flight speed of 155
nautical miles per hour (knots) (Bastian, 2010).
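
The travel-time calculation described above can be sketched as follows. The coordinates are hypothetical points on a kilometer grid, since the exact positions in Figure 2 are not reproduced here; the only fixed quantities are the 155-knot speed from the text and the definition of a nautical mile as 1.852 km.

```python
import math

KM_PER_NAUTICAL_MILE = 1.852  # exact by definition


def travel_time_hours(origin, destination, speed_knots: float = 155.0) -> float:
    """Flight time in hours over the Euclidean distance between two (x, y) points in km."""
    distance_km = math.dist(origin, destination)
    speed_km_per_hour = speed_knots * KM_PER_NAUTICAL_MILE
    return distance_km / speed_km_per_hour
```

At 155 knots (about 287 km/h), crossing one 15 km grid square takes roughly 3.1 minutes, so flight legs within the sub-area are short relative to the one-hour standard.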

There are two distinguishable medical treatment facilities (i.e., d = 2) available in support of
the Army brigade: the first is a role 2 medical treatment facility, denoted k = 2, and the second is
a role 3 medical treatment facility, denoted k = 3. The utilities $u_{ijkr}^X$ are set based on the time
it takes for asset j to transfer a casualty at location i to medical treatment facility k. Suppose this
takes t hours, on average, to a role 2 medical treatment facility. Then, the modified
Golden Hour utility function as a function of t is $\max\{-t + 1, 0\}$. The utility for transporting a
casualty to a role 3 medical treatment facility is assumed to be a factor of $\Gamma$ greater than the role
2 utility. Moreover, we assume the utility for transporting truly low-priority casualties is zero, since
these casualties are expected to survive regardless of where they are transported.
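
A minimal sketch of this utility follows, assuming the modified Golden Hour utility is read as one minus the transfer time in hours, floored at zero, and that the role 3 utility is the same quantity multiplied by the reward ratio. The function name, argument names, and the string labels for risk levels are hypothetical.

```python
def golden_hour_utility(t_hours: float, role: int, true_risk: str,
                        reward_ratio: float = 1.0) -> float:
    """Utility of transferring a casualty after t_hours to a role 2 or role 3 facility."""
    if true_risk != "H'":
        # Truly low-priority casualties contribute no utility.
        return 0.0
    base = max(1.0 - t_hours, 0.0)  # declines linearly, zero after one hour
    return reward_ratio * base if role == 3 else base
```

Under this reading, a transfer completed in half an hour earns 0.5 at a role 2 facility and 1.0 at a role 3 facility when the reward ratio is 2, while any transfer taking longer than one hour earns nothing at either facility.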

We  focus  on  the  disparity  in  proximity  and  medical  treatment  quality  between  the  role  3  and  role 
2  medical  treatment  facilities  in  this  example.  The  relative  utilities  and  travel  times  between  these 
two  types  of  facilities  are  pertinent  when  managing  military  medical  evacuation  system  logistics. 
Therefore, define the distance ratio θ as the relative travel distance to the role 3 medical treatment
facility as compared to the role 2 medical treatment facility for each call location i and responding
MEDEVAC asset j, where $\delta_{ijk}$ denotes the expected transport time to facility k:

$$\theta = \frac{\delta_{ij3}}{\delta_{ij2}}, \quad i = 1, 2, \ldots, n, \; j = 1, 2, \ldots, m.$$

When θ = 2, the relative transport time to the role 3 medical treatment facility is twice the relative
transport time to the role 2 medical treatment facility, given the same MEDEVAC asset j and
demand location i.

Define the reward ratio Γ to distinguish between the system utility received transporting a truly
high-priority casualty to the role 3 medical treatment facility and the utility received transporting
a truly high-priority casualty to the role 2 medical treatment facility:

$$\Gamma = \frac{u^{H'}_{ij3}}{u^{H'}_{ij2}}, \quad i = 1, 2, \ldots, n, \; j = 1, 2, \ldots, m.$$

When Γ = 2, the role 3 medical treatment facility is twice as medically capable as the role 2 medical
treatment facility, due to better resources, surgeons on staff, etc.

Table  1  reports  the  average  utility  when  transporting  to  the  role  2  medical  treatment  facility 
or  the  role  3  medical  treatment  facility,  for  each  MEDEVAC  asset  j  and  casualty  location  i  under 
the  base  case  of  the  military  medical  evacuation  system  in  this  example. 

Table 1: Average utility when transporting a true high-priority casualty to the role 2 medical
treatment facility and the role 3 medical treatment facility

Role 2 medical treatment facility (k = 2)
         i = 1    i = 2    i = 3    i = 4
j = 1    0.364    0.364    0.291    0.291
j = 2    0.312    0.417    0.312    0.269
j = 3    0.291    0.364    0.364    0.291
j = 4    0.312    0.343    0.312    0.343

Role 3 medical treatment facility (k = 3)
         i = 1    i = 2    i = 3    i = 4
j = 1    0.456    0.456    0.363    0.363
j = 2    0.390    0.520    0.390    0.336
j = 3    0.363    0.456    0.456    0.363
j = 4    0.390    0.429    0.390    0.429

Many of the transition probabilities depend on the length of time until a busy MEDEVAC asset
becomes free, for each dispatching or transporting action, or a call for service ends. Table 2 presents
the average service and transport time when responding to a call for service under the base case of
the military medical evacuation system in this example.


Table 2: Average service times and average transport times (in hours)

Average service time
         i = 1    i = 2    i = 3    i = 4
j = 1    0.500    0.552    0.574    0.552
j = 2    0.552    0.500    0.552    0.574
j = 3    0.574    0.552    0.500    0.552
j = 4    0.552    0.574    0.552    0.500

Average transport time, role 2 medical treatment facility (k = 2)
         i = 1    i = 2    i = 3    i = 4
j = 1    0.188    0.136    0.188    0.210
j = 2    0.136    0.083    0.136    0.157
j = 3    0.188    0.136    0.188    0.210
j = 4    0.210    0.157    0.210    0.231

Average transport time, role 3 medical treatment facility (k = 3)
         i = 1    i = 2    i = 3    i = 4
j = 1    0.376    0.271    0.376    0.419
j = 2    0.271    0.167    0.271    0.315
j = 3    0.376    0.271    0.376    0.419
j = 4    0.419    0.315    0.419    0.462
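
As a cross-check on the base case, the tabulated values can be compared against the two ratios defined earlier. The sketch below assumes that the last two blocks of Table 2 are the role 2 and role 3 transport times, and that the two blocks of Table 1 are the role 2 and role 3 utilities, in that order; the block labels are inferred from the numbers rather than stated in the tables.

```python
# Rows are assets j = 1..4, columns are casualty locations i = 1..4.
ROLE2_TRANSPORT = [[0.188, 0.136, 0.188, 0.210], [0.136, 0.083, 0.136, 0.157],
                   [0.188, 0.136, 0.188, 0.210], [0.210, 0.157, 0.210, 0.231]]
ROLE3_TRANSPORT = [[0.376, 0.271, 0.376, 0.419], [0.271, 0.167, 0.271, 0.315],
                   [0.376, 0.271, 0.376, 0.419], [0.419, 0.315, 0.419, 0.462]]
ROLE2_UTILITY = [[0.364, 0.364, 0.291, 0.291], [0.312, 0.417, 0.312, 0.269],
                 [0.291, 0.364, 0.364, 0.291], [0.312, 0.343, 0.312, 0.343]]
ROLE3_UTILITY = [[0.456, 0.456, 0.363, 0.363], [0.390, 0.520, 0.390, 0.336],
                 [0.363, 0.456, 0.456, 0.363], [0.390, 0.429, 0.390, 0.429]]


def max_ratio_deviation(numerators, denominators, target):
    """Largest absolute gap between the entrywise ratio and the target ratio."""
    return max(
        abs(num / den - target)
        for num_row, den_row in zip(numerators, denominators)
        for num, den in zip(num_row, den_row)
    )


distance_gap = max_ratio_deviation(ROLE3_TRANSPORT, ROLE2_TRANSPORT, target=2.0)
reward_gap = max_ratio_deviation(ROLE3_UTILITY, ROLE2_UTILITY, target=1.25)
print(f"distance ratio gap: {distance_gap:.4f}, reward ratio gap: {reward_gap:.4f}")
```

Both gaps are at the level of three-decimal rounding, which is consistent with a base case of θ = 2 and Γ = 1.25.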

The remaining transition probabilities depend on the rate at which casualties arrive to the
system. Calls arrive according to a Poisson process with rate λ = 3.0 calls per hour. The
distribution of calls P_i is uneven across the four casualty locations: there is a “hotbed”
of activity, with more frequent calls for service, in location 2. Table 3 presents the base case set of
input parameters and the corresponding ranges used for sensitivity analysis in this example.
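
The arrival process described above can be illustrated with a short sampler. This is a sketch under stated assumptions, not the simulation used in the paper: the function name, the seed, and the horizon are hypothetical, batch sizes are fixed at one casualty, and the location probabilities are the base case values [0.25, 0.50, 0.10, 0.15].

```python
import random


def simulate_calls(rate_per_hour, location_probs, horizon_hours, seed=0):
    """Sample (arrival_time, location) pairs from a Poisson process.

    Inter-arrival times are exponential with the given rate, and each call
    is assigned to location 1..n independently with the given probabilities.
    """
    rng = random.Random(seed)
    locations = list(range(1, len(location_probs) + 1))
    calls = []
    clock = rng.expovariate(rate_per_hour)
    while clock < horizon_hours:
        location = rng.choices(locations, weights=location_probs, k=1)[0]
        calls.append((clock, location))
        clock += rng.expovariate(rate_per_hour)
    return calls


calls = simulate_calls(3.0, [0.25, 0.50, 0.10, 0.15], horizon_hours=1000.0, seed=1)
share_location_2 = sum(1 for _, loc in calls if loc == 2) / len(calls)
print(len(calls), round(share_location_2, 3))
```

Over a long horizon the call count should be close to the rate times the horizon, and about half of the calls should fall in location 2.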

Table 3: Input parameters and ranges considered for sensitivity analysis

Input parameter                                Base case value           Parameter range
Medical treatment facilities d                 2                         -
Casualty locations n                           4                         -
Air MEDEVAC assets m                           4                         -
Classified risk levels R                       2                         -
Reward ratio Γ                                 1.25                      [1,1.5]
Distance ratio θ                               2                         [2,6]
Triage accuracy α                              10®                       [1,10®]
Call arrival rate per hour λ                   3.0                       [1.5,3.25]
Casualties in a batch X                        1                         -
Probability of casualty at each location P_i   [0.25 0.50 0.10 0.15]     -

All computations are performed on dual servers with Quad-Core 3.00 GHz processors and
16GB RAM. To solve the Markov decision process model (see Puterman, 1994), a value iteration
convergence algorithm with tolerance of 10“^ is used. The value iteration convergence algorithm
is presented in Appendix A. The run time for the value iteration algorithm is approximately 250
minutes for the base case model with 83,521 states. Thirty replications of a simulation for 10,000
casualties have a run time of approximately 18 minutes.


4.2  Policy  Comparison 

The optimal Markov decision process solution is compared to three heuristic policies: 1) transport
all casualties to the most rewarding medical treatment facility, 2) transport all casualties to the
closest medical treatment facility, and 3) transport low-priority casualties to the closest medical
treatment facility and transport high-priority casualties to the most rewarding medical treatment
facility. All three heuristics dispatch the closest available server. Figure 3 compares the objective
function value of the MDP to the performance of the three heuristics. The objective function
values are rescaled so that they reflect the average modified Golden Hour utility received per
casualty. Figure 3 shows the magnitude of the improvement in system performance gained from
optimization techniques rather than heuristic policies. The optimal policy
yields solution values that are, on average, 4.55%, 21.32%, and 0.72% better than heuristics 1, 2,
and 3, respectively, as the distance ratio θ increases. When the distance ratio
is small, e.g., θ = 2, the MDP only mildly outperforms heuristic 3 and heuristic 1. However, as the
distance ratio θ increases, the margin by which the MDP outperforms the best heuristic increases.

Figure 3: Simulated policy comparison
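
The three heuristic policies and their shared dispatch rule can be written down compactly. The sketch below is illustrative only: the `Facility` class, the function names, and the use of a single "H"/"L" priority argument at transport time are assumptions, and the sample numbers are the j = 1, i = 1 entries of Tables 1 and 2 under the assumed block labels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Facility:
    name: str
    transport_hours: float  # average transport time to this facility
    utility: float          # utility for a true high-priority casualty


def closest_available_asset(response_hours, available):
    """Dispatch rule shared by all three heuristics: nearest free asset, or None."""
    free = [j for j in response_hours if j in available]
    return min(free, key=response_hours.get) if free else None


def heuristic_destination(heuristic, priority, facilities):
    """Transport rule for heuristic 1, 2, or 3; priority is "H" or "L"."""
    most_rewarding = max(facilities, key=lambda f: f.utility)
    closest = min(facilities, key=lambda f: f.transport_hours)
    if heuristic == 1:
        return most_rewarding
    if heuristic == 2:
        return closest
    if heuristic == 3:
        return most_rewarding if priority == "H" else closest
    raise ValueError("heuristic must be 1, 2, or 3")


facilities = [Facility("role 2", 0.188, 0.364), Facility("role 3", 0.376, 0.456)]
print(heuristic_destination(3, "H", facilities).name)  # role 3
print(heuristic_destination(3, "L", facilities).name)  # role 2
```

Heuristic 3 is the only one of the three that uses the casualty's priority, which is why it tracks the optimal transport decisions most closely.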

The loss rate is defined as the percentage of calls for service lost within the military medical
evacuation system, due to the system being overcrowded with no available air MEDEVAC assets.
For the MDP model, under the base case with λ = 3.0 and θ = 2, the loss rate is 12.97%.
Likewise, when θ = 6 the loss rate is 16.38%. In practice, the so-called “lost” calls are delegated
to non-traditional MEDEVAC assets so that all casualties receive timely service. Military medical
evacuation systems differ from their civilian counterparts in that every effort is made to keep the
queue for service at zero (Bozell, 2013).

4.3  Dispatching  and  Transporting  Sensitivity 

The MDP model optimizes over both the dispatching and transporting decisions, and therefore
it is of interest to know when to transport casualties to the different medical treatment facilities.
Figures 4(a)-4(c) illustrate the sensitivity of the proportion of high-priority casualties delivered
to the role 3 medical treatment facility as a function of the distance ratio θ for different Γ, λ, and α
values. A main insight of military medical evacuation systems, seen in Figure 4(a), is the impact
of the reward ratio Γ on the proportion of casualties transported to the more rewarding facility.
Specifically, when the role 3 medical treatment facility is 50% better than the role 2 medical
treatment facility (Γ = 1.5), all high-priority casualties are transported to the more rewarding role
3 medical treatment facility, regardless of whether the role 3 medical treatment facility is close or
far. Figure 4(b) provides insights on how the call arrival rate λ affects the system. An increase
in λ floods the system with more casualties, and more high-priority casualties are transported to
the closer role 2 medical treatment facility so that servers can end service and become available
to respond. It is optimal to deliver almost all true high-priority casualties to the role 3 medical
treatment facility unless the role 3 medical treatment facility is very distant and there are few
marginal benefits or high call volume. In sum, these results suggest that a heuristic that transports
truly high-priority casualties to the medical treatment facility with the higher utility (unless it is
extremely distant) and truly low-priority casualties to the nearest medical treatment facility, as
done by Heuristic 3, can be used by military decision makers as a near-optimal transportation
policy. A heuristic transport policy has the added benefit of greatly reducing the state space to
(1 + nN)^m states, which helps to improve model scalability. However, it is less clear which asset
to send upon initial dispatch. The dispatch decision is largely responsible for the difference in
performance between the optimal policy and Heuristic 3 (see Figure 3), and therefore, we examine
this issue next.
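
The size of the reduction can be illustrated with base case numbers. The sketch rests on two assumptions that the text does not state outright: that the reduced state space expression reads (1 + nN)^m, and that the 83,521 states of the full model factor as 17 states for each of the m = 4 assets (83,521 = 17^4 is an arithmetic fact; the per-asset interpretation is an inference).

```python
def product_state_count(states_per_asset, num_assets):
    """Size of a state space that is a product of identical per-asset state sets."""
    return states_per_asset ** num_assets


def heuristic_state_count(num_locations, max_batch, num_assets):
    """Assumed reading of the reduced state space size: (1 + n*N)^m."""
    return (1 + num_locations * max_batch) ** num_assets


full = product_state_count(17, 4)          # matches the 83,521 states reported
reduced = heuristic_state_count(4, 1, 4)   # n = 4, N = 1, m = 4 in the base case
print(full, reduced, round(full / reduced, 1))
```

Under these assumptions the heuristic transport policy would shrink the base case state space by a factor of more than one hundred.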

Table 4 presents the proportion of classified high-priority and low-priority calls to which the
closest MEDEVAC asset is dispatched, which captures system insights on whether to send the
closest server or ration it instead. We note that the closest MEDEVAC asset is not always available,
so it is impossible for these values to be equal to 1.0. We examine this decision across different
levels of triage accuracy, from a worst-case lower bound α = 1 to α = 100. Consider the classified H
casualties in Table 4. The general insight we gain is that, as α increases, the frequency with which
the closest server is dispatched decreases for locations 3 and 4, and increases for locations 1 and 2. The model is
accounting for the call location distribution P_i, where locations 1 and 2 have the greatest probability
of a call for service, and reducing the response time to classified H calls for service in locations 1
and 2. Reducing the response time by sending the closest server allows the servers to finish service
and become available sooner for the additional calls expected in locations 1 and 2.

Figure 4: Sensitivity analysis on the proportion of true high-priority casualties H' transported to
the role 3 medical treatment facility, with respect to distance ratio θ. Panels: (a) Γ sensitivity,
(b) λ sensitivity, (c) α sensitivity.

In a similar manner, Table 4 also presents the closest MEDEVAC asset dispatching frequency
for classified low-priority casualties. As α increases, each MEDEVAC asset generally responds to
fewer calls in its “home” location. Therefore, when information is more accurate (i.e., when α is
large), the system “saves” some MEDEVAC assets for responding to nearby truly high-priority
casualties by strategically sending more distant MEDEVAC assets to low-priority casualties. This
suggests that as classification accuracy improves, it is optimal to ration MEDEVAC assets in areas
with the largest rate of truly high-priority calls.

Table 4: Proportion of calls for service at each location that are responded to by the closest
MEDEVAC asset

θ        α          Location i    Classified H calls    Classified L calls
θ = 2    α = 1      1             0.506                 0.506
                    2             0.447                 0.447
                    3             0.503                 0.503
                    4             0.555                 0.555
         α = 10     1             0.516                 0.416
                    2             0.535                 0.071
                    3             0.460                 0.456
                    4             0.484                 0.480
         α = 100    1             0.534                 0.256
                    2             0.535                 0.069
                    3             0.461                 0.430
                    4             0.452                 0.461
θ = 6    α = 1      1             0.434                 0.434
                    2             0.388                 0.389
                    3             0.422                 0.422
                    4             0.478                 0.478
         α = 10     1             0.447                 0.349
                    2             0.463                 0.077
                    3             0.409                 0.406
                    4             0.431                 0.440
         α = 100    1             0.457                 0.237
                    2             0.464                 0.076
                    3             0.415                 0.347
                    4             0.415                 0.411

We further study the impact of the initial dispatch decision on later transport decisions by
examining whether the transport decisions depend on the MEDEVAC asset dispatched. Recall that
the MEDEVAC asset dispatched to a call for service also transports the casualty. Table 5 shows
the proportion of true high-priority casualties that are transported to the role 3 medical treatment
facility in two cases: when the closest MEDEVAC asset responds and when a more distant MEDEVAC
asset responds. Here, we see that when θ = 2 all responding MEDEVAC assets transport casualties
to the role 3 medical treatment facility, both when dispatch classification is poor (α = 1) and
when dispatch classification is better (α = 100). Also, when the role 3 medical treatment facility
is farther away and θ = 6, we see that more distant responding MEDEVAC assets are less likely to later
transport truly high-priority casualties to role 3 medical treatment facilities. There is less incentive
(in terms of the Golden Hour performance measure) to transport casualties to the role 3 medical
treatment facility when more distant MEDEVAC assets respond to a casualty. Tables 4 and 5
together shed light on how the revelation of information affects decisions throughout the treatment
and delivery of each casualty. In particular, when there are more initial classification errors, distant
assets are more likely to respond to H' casualties, who are then less likely to be transported to
medical treatment facilities with the best capabilities.

Table 5: Proportion of true high-priority casualties transported to role 3 medical treatment facility
based on the responding MEDEVAC asset

θ    α                     i    Closest MEDEVAC     More distant MEDEVAC
                                asset responds      asset responds
2    α = 1, 10, and 100    1    1.000               1.000
                           2    1.000               1.000
                           3    1.000               1.000
                           4    1.000               1.000
6    α = 1                 1    0.541               0.249
                           2    1.000               0.780
                           3    0.809               0.231
                           4    0.480               0.037
     α = 10                1    0.560               0.139
                           2    1.000               0.761
                           3    0.775               0.145
                           4    0.429               0.042
     α = 100               1    0.563               0.105
                           2    1.000               0.756
                           3    0.751               0.107
                           4    0.317               0.053

5  Conclusions


This  paper  models  and  analyzes  optimal  dispatching  and  transporting  policies  in  military  medical 
evacuation  systems.  Timely  transportation  of  casualties  motivates  the  need  to  examine  how  to  make 
better  interrelated  decisions — how  to  dispatch  MEDEVAC  assets  to  casualties  and  then  transport 
casualties  to  medical  treatment  facilities — given  the  revelation  of  information  over  the  duration  of 
each  call.  An  undiscounted,  infinite  horizon,  average-cost  MDP  model  is  formulated  to  identify 
optimal  policies,  which  is  solved  using  a  value  iteration  algorithm.  In  the  computational  example, 
a  situation  where  two  medical  treatment  facilities  are  distinguishable  by  both  their  proximity  to 
calls for service (distance) and treatment capability (reward) is considered. Each dispatching and
transporting decision affects whether system resources are busy or available to respond to additional calls
for service. Optimal decision policies use the better role 3 medical treatment facility with varying
frequency,  as  system  input  parameters  such  as  call  volume  and  dispatcher  classification  ability  are 
varied.  The  optimal  policy  outperforms  three  heuristics  considered  in  this  paper  on  average  by 
4.55%,  21.32%,  and  0.72%,  respectively.  The  initial  dispatch  decisions  account  for  much  of  the 
improvement  over  the  heuristic  policies.  The  computational  results  suggest  that  for  most  settings, 
a  heuristic  policy  could  be  used  for  the  transport  decisions,  which  would  greatly  reduce  the  state 
space  and  improve  model  scalability. 

Future work will focus on locating two types of MEDEVAC assets, such as air assets and
ground assets. Another extension is to consider co-locating multiple types of dependent military
assets, such as a MEDEVAC asset and a security escort asset, to dispatch one unit of each type in
tandem to a casualty incident. A further extension is a bi-objective model for balancing casualty
Golden Hour coverage levels and risk tolerance, such as is found in risky evacuation missions. Work
is under way to address these extensions.

Appendices


A  Value  Iteration  Convergence  Algorithm 


To solve for the optimal policy, the relative value function algorithm (see Puterman, 1994) is run
using the finite-horizon value functions. Therefore, t is the iteration here, not time. To do so, define
$v_t(s(t))$ as the value of being in state s(t) during iteration t, for t = 0, ..., V - 1, and $v_0(s(t)) = 0$
for all $s(t) \in S$.


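$$
\begin{aligned}
v_{t+1}(s(t)) = \frac{1}{\gamma} \Bigg[ & \sum_{k=1}^{d} \sum_{j=1}^{m} \sum_{i=1}^{n} (\delta_{ijk})^{-1} I_{\{s_j = D_k\}} \, v_t\big(S(s(t) \mid s_j = 0)\big) && \text{(2a)} \\
& + \sum_{i=1}^{n} \sum_{X=1}^{N} \sum_{R \in \{H,L\}} \lambda P_i P_X P_{R|i} \max_{j \in A_1(s(t))} \big\{ v_t\big(S(s(t) \mid s_j = (i,R,X))\big) \big\} && \text{(2b)} \\
& + \sum_{i=1}^{n} \sum_{X=1}^{N} \sum_{j=1}^{m} \sum_{R \in \{H,L\}} \sum_{r \in \{H',L'\}} \frac{P_{r|R,i}}{\mu_{ij}} I_{\{s_j = (i,R,X)\}} \max_{D_k \in A_2(s(t))} \big\{ v_t\big(S(s(t) \mid s_j = D_k)\big) + \gamma u^{r}_{ijk} \big\} && \text{(2c)} \\
& + \Big( \gamma - \lambda - \sum_{i=1}^{n} \sum_{X=1}^{N} \sum_{j=1}^{m} \sum_{R \in \{H,L\}} (\mu_{ij})^{-1} I_{\{s_j = (i,R,X)\}} - \sum_{i=1}^{n} \sum_{j=1}^{m} \sum_{k=1}^{d} (\delta_{ijk})^{-1} I_{\{s_j = D_k\}} \Big) v_t(s(t)) \Bigg] && \text{(2d)}
\end{aligned}
$$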

To achieve the optimal policy, the relative value iteration algorithm is run until the upper and
lower bounds converge to the optimal average utility per stage g,

$$L_t \le L_{t+1} \le g \le U_{t+1} \le U_t,$$

with lower bound

$$L_t = \min_{s(t) \in S} \big[ v_{t+1}(s(t)) - v_t(s(t)) \big]$$

and upper bound

$$U_t = \max_{s(t) \in S} \big[ v_{t+1}(s(t)) - v_t(s(t)) \big].$$

The value iteration algorithm is executed until $U_{t+1} - L_{t+1} < \epsilon$, for a given $\epsilon$.
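
The stopping rule can be illustrated on a toy problem. The sketch below is a generic relative value iteration for a small average-reward MDP with the min/max bound test described above; it is not the MEDEVAC model, and the two-state example, the function name, and the choice of state 0 as the reference state are all hypothetical.

```python
def relative_value_iteration(rewards, transitions, epsilon=1e-6, max_iterations=10_000):
    """Average-reward value iteration with lower/upper bound stopping.

    rewards[s][a]:      one-stage reward for action a in state s.
    transitions[s][a]:  list of probabilities over next states.
    Returns (gain estimate, greedy policy, (lower bound, upper bound)).
    """
    values = [0.0] * len(rewards)
    for _ in range(max_iterations):
        updated, policy = [], []
        for s in range(len(rewards)):
            action_values = [
                rewards[s][a] + sum(p * values[nxt] for nxt, p in enumerate(transitions[s][a]))
                for a in range(len(rewards[s]))
            ]
            best = max(range(len(action_values)), key=action_values.__getitem__)
            policy.append(best)
            updated.append(action_values[best])
        differences = [new - old for new, old in zip(updated, values)]
        lower, upper = min(differences), max(differences)
        # Subtract the reference state's value so the iterates stay bounded.
        values = [new - updated[0] for new in updated]
        if upper - lower < epsilon:
            return (lower + upper) / 2.0, policy, (lower, upper)
    raise RuntimeError("bounds did not converge within max_iterations")


# Hypothetical example: action 0 stays put, action 1 moves to the other state.
rewards = [[1.0, 0.0], [2.0, 0.0]]
transitions = [[[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]]]
gain, policy, bounds = relative_value_iteration(rewards, transitions)
print(gain, policy, bounds)
```

In this example the bounds meet at an average reward of 2 per stage: the greedy policy moves from state 0 to state 1 and then stays there.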




B  Theoretical  Results 


We consider the finite stage optimality equations and consider the limit. The N-stage case MDP
equations are in Appendix A. Note that in Section 3, equations (1a)-(1d) capture the exact, infinite
horizon, average cost optimality equations. In contrast, equations (2a)-(2d) capture the optimality
equations for the finite-horizon case.

Next we exploit the finite case optimality equations to analyze the MDP structural properties.
The first lemma shows that it is always optimal to choose to transport casualties to the more
rewarding medical treatment facility when two medical treatment facilities have the
same expected transport time. This suggests that transporting to the closest facility is optimal.

Lemma 1. Suppose a MEDEVAC asset j finishes service at location i and needs to transport a
casualty with risk level r to one of two medical treatment facilities, labeled as 1 and 2, $r \in \{H', L'\}$.
If both facilities have the same expected transport time, i.e., $\delta_{ij1} = \delta_{ij2}$, and facility 1 has a higher
utility than facility 2, i.e., $u^{r}_{ij1} > u^{r}_{ij2}$, then it is always better to deliver to the facility with the
highest utility.

Proof. Without loss of generality, assume that the system is in state s(t) with value $v_t(s(t))$. The
set of available transport decisions here is $A_2(s(t)) = \{D_1, D_2\}$, which correspond to facilities 1
and 2. Let $s1(t+1)$ and $s2(t+1)$ denote the states when facilities 1 and 2 are selected, respectively
(see (1c)). It is sufficient to show that $v_t(s1(t+1)) + \gamma u^{r}_{ij1} > v_t(s2(t+1)) + \gamma u^{r}_{ij2}$. Note that in
this case, the state in place j corresponding to asset j moves into the same transport state (i.e.,
$s_j = D$) whether medical treatment facility 1 or 2 is selected. Then, we can rearrange this to obtain

$$\gamma u^{r}_{ij1} - \gamma u^{r}_{ij2} > v_t(s2(t+1)) - v_t(s1(t+1)).$$

Since $\delta_{ij1} = \delta_{ij2}$ for i = 1, ..., n, j = 1, ..., m, the value functions in these two states entirely
cancel, yielding $\gamma u^{r}_{ij1} - \gamma u^{r}_{ij2} > 0$, which is true since $\gamma > 0$ and $u^{r}_{ij1} > u^{r}_{ij2}$ for i = 1, ..., n,
j = 1, ..., m, $r \in \{H', L'\}$. □

The  next  proposition  shows  that  the  average  utility  per  stage  is  higher  in  a  state  when  a 
MEDEVAC  is  available  as  compared  to  when  it  is  busy  transporting  a  casualty. 

Proposition 1. Let state s(t) be a state where server j is available, i.e., $s_j(t) = 0$. Let state $\tilde{s}(t)$
be the corresponding state where server j is transporting a casualty, i.e., $\tilde{s}_j(t) = D_k$ for some k and
$\tilde{s}_l(t) = s_l(t)$, l = 1, ..., m and $l \ne j$. Then $v_t(s(t)) - v_t(\tilde{s}(t)) \ge 0$ for all $t \ge 0$.

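Proof. The claim will be shown by induction. First, note that the claim is trivially true for t = 0
since $v_0(s(0)) = 0$ for all states s(0). Let s(t) and s(t + 1) denote identical states at different times.
After some rearranging:

$$
\begin{aligned}
\gamma \big( v_{t+1}(s(t+1)) - v_{t+1}(\tilde{s}(t+1)) \big) = {} & \sum_{k=1}^{d} \sum_{j=1}^{m} \sum_{i=1}^{n} (\delta_{ijk})^{-1} I_{\{s_j = D_k\}} \Big( v_t\big(S(s(t) \mid s_j = 0)\big) - v_t\big(S(\tilde{s}(t) \mid \tilde{s}_j = 0)\big) \Big) \\
& + \sum_{i=1}^{n} \sum_{X=1}^{N} \sum_{R \in \{H,L\}} \lambda P_i P_X P_{R|i} \Big( \max_{j \in A_1(s(t))} \big\{ v_t\big(S(s(t) \mid s_j = (i,R,X))\big) \big\} - \max_{j \in A_1(\tilde{s}(t))} \big\{ v_t\big(S(\tilde{s}(t) \mid \tilde{s}_j = (i,R,X))\big) \big\} \Big) && \text{(3a)} \\
& + \sum_{i=1}^{n} \sum_{X=1}^{N} \sum_{j=1}^{m} \sum_{R \in \{H,L\}} \sum_{r \in \{H',L'\}} \frac{P_{r|R,i}}{\mu_{ij}} I_{\{s_j = (i,R,X)\}} \Big( \max_{D_k \in A_2(s(t))} \big\{ v_t\big(S(s(t) \mid s_j = D_k)\big) + \gamma u^{r}_{ijk} \big\} \\
& \qquad - \max_{D_k \in A_2(\tilde{s}(t))} \big\{ v_t\big(S(\tilde{s}(t) \mid \tilde{s}_j = D_k)\big) + \gamma u^{r}_{ijk} \big\} \Big) && \text{(3b)} \\
& + \Big( \gamma - \lambda - \sum_{i=1}^{n} \sum_{X=1}^{N} \sum_{j=1}^{m} \sum_{R \in \{H,L\}} (\mu_{ij})^{-1} I_{\{s_j = (i,R,X)\}} - \sum_{i=1}^{n} \sum_{j=1}^{m} \sum_{k=1}^{d} (\delta_{ijk})^{-1} I_{\{s_j = D_k\}} \Big) \big( v_t(s(t)) - v_t(\tilde{s}(t)) \big).
\end{aligned}
$$

Note that in line (3a), the set of actions in $A_1(\tilde{s}(t))$ is a subset of those in state $A_1(s(t+1))$.
Let $j^* = \arg\max_j \{A_1(\tilde{s}(t+1))\}$. We can bound the expression above from below by setting both
decisions in (3a) to $j^*$. Likewise, we can apply this same idea to the actions in $A_2(s(t))$ selected in
both maximizations in (3b). Let $d^* = \arg\max_{D_k} \{A_2(\tilde{s}(t+1))\}$, and set the destination in the first
maximization to $d^*$. Then,

$$
\begin{aligned}
\gamma \big( v_{t+1}(s(t+1)) - v_{t+1}(\tilde{s}(t+1)) \big) \ge {} & \sum_{k=1}^{d} \sum_{j=1}^{m} \sum_{i=1}^{n} (\delta_{ijk})^{-1} I_{\{s_j = D_k\}} \Big( v_t\big(S(s(t) \mid s_j = 0)\big) - v_t\big(S(\tilde{s}(t) \mid \tilde{s}_j = 0)\big) \Big) \\
& + \sum_{i=1}^{n} \sum_{X=1}^{N} \sum_{R \in \{H,L\}} \lambda P_i P_X P_{R|i} \Big( v_t\big(S(s(t) \mid s_{j^*} = (i,R,X))\big) - v_t\big(S(\tilde{s}(t) \mid \tilde{s}_{j^*} = (i,R,X))\big) \Big) \\
& + \sum_{i=1}^{n} \sum_{X=1}^{N} \sum_{j=1}^{m} \sum_{R \in \{H,L\}} \sum_{r \in \{H',L'\}} \frac{P_{r|R,i}}{\mu_{ij}} I_{\{s_j = (i,R,X)\}} \Big( v_t\big(S(s(t) \mid s_j = d^*)\big) - v_t\big(S(\tilde{s}(t) \mid \tilde{s}_j = d^*)\big) \Big) \\
& + \Big( \gamma - \lambda - \sum_{i=1}^{n} \sum_{X=1}^{N} \sum_{j=1}^{m} \sum_{R \in \{H,L\}} (\mu_{ij})^{-1} I_{\{s_j = (i,R,X)\}} - \sum_{i=1}^{n} \sum_{j=1}^{m} \sum_{k=1}^{d} (\delta_{ijk})^{-1} I_{\{s_j = D_k\}} \Big) \big( v_t(s(t)) - v_t(\tilde{s}(t)) \big).
\end{aligned}
$$

Here, all four lines are non-negative by the induction assumption. Therefore, the claim is
true.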

□ 

The second lemma shows that it is always optimal to choose to transport casualties to the
“closer” medical treatment facility if both facilities have the same utility, where a facility is “closer”
to a casualty at location i with asset j if its expected transport time is smaller.

Lemma 2. Suppose in state s(t) a MEDEVAC asset finishes service at the scene and needs to
transport a casualty with risk level r to one of two medical treatment facilities, labeled as 1 and
2, $r \in \{H', L'\}$. If both facilities have the same utility, i.e., $u^{r}_{ij1} = u^{r}_{ij2}$ for i = 1, ..., n, j = 1, ..., m,
$r \in \{H', L'\}$, and the expected transport time is shorter for facility 1, i.e., $\delta_{ij1} < \delta_{ij2}$ for
i = 1, ..., n, j = 1, ..., m, then $v_t(S(s(t) \mid s_j = D_1)) - v_t(S(s(t) \mid s_j = D_2)) \ge 0$ for all $t \ge 0$ and
it is always better to deliver to the facility with the smaller expected service time.

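Proof. The claim will be shown by induction. First, note that the claim is trivially true for t = 0
since $v_0(s(0)) = 0$ for all states s(0). We assume that $v_t(s1(t)) - v_t(s2(t)) \ge 0$ for all states s1 and s2
that are otherwise identical. Let MEDEVAC asset $j^*$ be the asset that must transport casualties to a medical
treatment facility. Let the state $s1(t+1) = S(s(t) \mid s_j = D_1)$ and let $s2(t+1) = S(s(t) \mid s_j = D_2)$.
Next, after some rearranging:

$$
\begin{aligned}
\gamma \big( v_{t+1}(s1(t+1)) - v_{t+1}(s2(t+1)) \big) = {} & \sum_{k=1}^{d} \sum_{j=1, j \ne j^*}^{m} \sum_{i=1}^{n} (\delta_{ijk})^{-1} I_{\{s_j = D_k\}} \Big( v_t\big(S(s1(t) \mid s1_j = 0)\big) - v_t\big(S(s2(t) \mid s2_j = 0)\big) \Big) \\
& + \sum_{i=1}^{n} \sum_{X=1}^{N} \sum_{R \in \{H,L\}} \lambda P_i P_X P_{R|i} \Big( \max_{j \in A_1(s1(t))} \big\{ v_t\big(S(s1(t) \mid s1_j = (i,R,X))\big) \big\} - \max_{j \in A_1(s2(t))} \big\{ v_t\big(S(s2(t) \mid s2_j = (i,R,X))\big) \big\} \Big) && \text{(4a)} \\
& + \sum_{i=1}^{n} \sum_{X=1}^{N} \sum_{j=1}^{m} \sum_{R \in \{H,L\}} \sum_{r \in \{H',L'\}} \frac{P_{r|R,i}}{\mu_{ij}} I_{\{s_j = (i,R,X)\}} \Big( \max_{D_k \in A_2(s1(t))} \big\{ v_t\big(S(s1(t) \mid s1_j = D_k)\big) + \gamma u^{r}_{ijk} \big\} \\
& \qquad - \max_{D_k \in A_2(s2(t))} \big\{ v_t\big(S(s2(t) \mid s2_j = D_k)\big) + \gamma u^{r}_{ijk} \big\} \Big) && \text{(4b)} \\
& + \Big( \gamma - \lambda - \sum_{i=1}^{n} \sum_{X=1}^{N} \sum_{j=1, j \ne j^*}^{m} \sum_{R \in \{H,L\}} (\mu_{ij})^{-1} I_{\{s_j = (i,R,X)\}} - \sum_{i=1}^{n} \sum_{j=1, j \ne j^*}^{m} \sum_{k=1}^{d} (\delta_{ijk})^{-1} I_{\{s_j = D_k\}} \Big) \big( v_t(s1(t)) - v_t(s2(t)) \big) \\
& + (\delta_{ij1})^{-1} v_t\big(S(s1(t) \mid s1_j = 0)\big) - (\delta_{ij2})^{-1} v_t\big(S(s2(t) \mid s2_j = 0)\big) - \Big( (\delta_{ij1})^{-1} v_t(s1(t)) - (\delta_{ij2})^{-1} v_t(s2(t)) \Big).
\end{aligned}
$$

As in Proposition 1, let $j^* = \arg\max_j \{A_1(s2(t))\}$. We can bound the expression above from below
by setting both decisions in (4a) to $j^*$. Likewise, we can apply this same idea to the actions selected
in both maximizations in (4b). Let $d^* = \arg\max_{D_k} \{A_2(s2(t))\}$, and set the destination in the first
maximization to $d^*$. Moreover, the last line can be rearranged to yield

$$(\delta_{ij1})^{-1} \big( v_t(s0(t)) - v_t(s1(t)) \big) - (\delta_{ij2})^{-1} \big( v_t(s0(t)) - v_t(s2(t)) \big)$$

after noting that $s0(t) = S(s1(t) \mid s1_j = 0) = S(s2(t) \mid s2_j = 0)$. Moreover, we can bound this
below by applying the induction assumption. This yields:

$$
\begin{aligned}
\gamma \big( v_{t+1}(s1(t+1)) - v_{t+1}(s2(t+1)) \big) \ge {} & \sum_{k=1}^{d} \sum_{j=1, j \ne j^*}^{m} \sum_{i=1}^{n} (\delta_{ijk})^{-1} I_{\{s_j = D_k\}} \Big( v_t\big(S(s1(t) \mid s1_j = 0)\big) - v_t\big(S(s2(t) \mid s2_j = 0)\big) \Big) \\
& + \sum_{i=1}^{n} \sum_{X=1}^{N} \sum_{R \in \{H,L\}} \lambda P_i P_X P_{R|i} \Big( v_t\big(S(s1(t) \mid s1_{j^*} = (i,R,X))\big) - v_t\big(S(s2(t) \mid s2_{j^*} = (i,R,X))\big) \Big) \\
& + \sum_{i=1}^{n} \sum_{X=1}^{N} \sum_{j=1}^{m} \sum_{R \in \{H,L\}} \sum_{r \in \{H',L'\}} \frac{P_{r|R,i}}{\mu_{ij}} I_{\{s_j = (i,R,X)\}} \Big( v_t\big(S(s1(t) \mid s1_j = d^*)\big) - v_t\big(S(s2(t) \mid s2_j = d^*)\big) \Big) \\
& + \Big( \gamma - \lambda - \sum_{i=1}^{n} \sum_{X=1}^{N} \sum_{j=1, j \ne j^*}^{m} \sum_{R \in \{H,L\}} (\mu_{ij})^{-1} I_{\{s_j = (i,R,X)\}} - \sum_{i=1}^{n} \sum_{j=1, j \ne j^*}^{m} \sum_{k=1}^{d} (\delta_{ijk})^{-1} I_{\{s_j = D_k\}} \Big) \big( v_t(s1(t)) - v_t(s2(t)) \big) \\
& + \big( (\delta_{ij1})^{-1} - (\delta_{ij2})^{-1} \big) \big( v_t(s0(t)) - v_t(s1(t)) \big).
\end{aligned}
$$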

The  first  four  lines  are  greater  than  or  equal  to  zero  by  the  induction  assumption.  The  last  line  is 
greater  than  or  equal  to  zero  by  Proposition  1  and  by  noting  that  “  (E'2)~^  — 